Stochastic k-Tree Grammar and Its Application in Biomolecular Structure Modeling

Authors

  • Liang Ding
  • Abdul Samad
  • Xingran Xue
  • Xiuzhen Huang
  • Russell L. Malmberg
  • Liming Cai
Abstract

Stochastic context-free grammar (SCFG) has been successful in modeling biomolecular structures, typically RNA secondary structure, for statistical analysis and structure prediction. Context-free grammar rules specify parallel and nested co-occurrences of terminals, and are thus ideal for modeling the canonical nucleotide base pairs that constitute RNA secondary structure. Stochastic grammars that can adequately model biomolecular tertiary structures, which are beyond context-free, have long been sought. Some existing linguistic grammars, developed mostly for natural language processing, appear insufficient to account for the crossing relationships incurred by distant interactions of bio-residues, while others are overly powerful and cause excessive computational complexity. This paper introduces a novel stochastic grammar, called stochastic k-tree grammar (SkTG), for the analysis of context-sensitive languages. With the new grammar rules, co-occurrences of distant terminals are characterized and recursively organized into k-tree graphs. The new grammar offers a viable approach to modeling context-sensitive interactions between bio-residues because, as earlier investigations have demonstrated, such relationships are often constrained by k-trees for small values of k. This paper shows, for the first time, that probabilistic analysis of k-trees over strings is computable in time polynomial in n. Hence, SkTG permits not only the modeling of biomolecular tertiary structures but also the efficient analysis and prediction of such structures.
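To make the notion of a k-tree over string positions concrete, the following is a minimal sketch of the standard graph-theoretic definition: start from a clique on k+1 vertices and repeatedly attach a new vertex to all members of an existing k-clique. It is not the SkTG grammar or its parsing algorithm, which the abstract does not specify; the names `build_k_tree` and `crosses` are hypothetical and chosen only for illustration. Vertices stand for positions in a string, and the example shows that a 2-tree can hold a pair of crossing edges, the kind of relationship a nested (context-free) structure cannot express.

```python
from itertools import combinations


def build_k_tree(k, attachments):
    """Build a k-tree using the standard recursive definition.

    Vertices 0..k form the initial (k+1)-clique. Each entry of
    `attachments` is a k-subset of already existing vertices that must
    form a clique; the next new vertex is joined to every member of it.
    Returns (number_of_vertices, set_of_edges), with edges as frozensets.
    Hypothetical helper for illustration, not the paper's method.
    """
    edges = {frozenset(pair) for pair in combinations(range(k + 1), 2)}
    n = k + 1
    for clique in attachments:
        clique = tuple(clique)
        if len(set(clique)) != k or any(v < 0 or v >= n for v in clique):
            raise ValueError(f"{clique} is not a set of {k} existing vertices")
        if any(frozenset(pair) not in edges for pair in combinations(clique, 2)):
            raise ValueError(f"{clique} is not a clique in the current graph")
        edges.update(frozenset((n, v)) for v in clique)
        n += 1
    return n, edges


def crosses(e, f):
    """True if edges e and f, drawn as arcs over string positions, cross."""
    (a, b), (c, d) = sorted(e), sorted(f)
    return a < c < b < d or c < a < d < b


# A 2-tree on six positions: vertex 3 attaches to {1, 2}, 4 to {2, 3}, 5 to {3, 4}.
n, edges = build_k_tree(2, [(1, 2), (2, 3), (3, 4)])
crossing = [(sorted(e), sorted(f)) for e, f in combinations(edges, 2) if crosses(e, f)]
print(n, len(edges), crossing)
```

A k-tree on n vertices always has k*n - k*(k+1)/2 edges, so the graph stays sparse for small k while still admitting crossing pairs such as (1, 3) and (2, 4) above.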

Similar resources

Energy Scheduling in Power Market under Stochastic Dependence Structure

Since the emergence of power market, the target of power generating utilities has mainly switched from cost minimization to revenue maximization. They dispatch their power energy generation units in the uncertain environment of power market. As a result, multi-stage stochastic programming has been applied widely by many power generating agents as a suitable tool for dealing with self-scheduling...

Data-Oriented Parsing

1. A DOP model for phrase-structure trees R. Bod and R. Scha 2. Probability models for DOP R. Bonnema 3. Encoding frequency information in stochastic parsing models 1. Computational complexity of disambiguation under DOP K. Sima'an 2. Parsing DOP with Monte Carlo techniques J. Chappelier and M. Rajman 3. Towards efficient Monte Carlo parsing R. Bonnema 4. Efficient parsing of DOP with PCFG-redu...

A survey on random walk-based stochastic modeling in eukaryotic cell migration with emphasis on its application in cancer

Impairments in cell migration processes may cause various diseases, among which cancer cell metastasis, tumor angiogenesis, and the disability of immune cells to infiltrate into tumors are prominent ones. Mathematical modeling has been widely used to analyze the cell migration process. Cell migration is a complicated process and requires statistical methods such as random walk for proper analys...

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

Publication date: 2014